Update of Correlation Analysis between Active Galactic Nuclei 
and Ultra-High Energy Cosmic Rays 

Hang Bae Kim and Jihyun Kim

Department of Physics and The Research Institute of Natural Science 
Hanyang University, Seoul 133-791, Korea 



Abstract

We update the previous analysis of correlation between ultra-high energy cosmic rays (UHECR) and active galactic nuclei (AGN), using 69 UHECR events with energy $E > 55$ EeV released in 2010 by the Pierre Auger Observatory (PAO) and 862 AGN within the distance $d < 100$ Mpc listed in the 13th edition of the Veron-Cetty and Veron AGN catalog. To make the test hypothesis definite, we use the simple AGN source model in which UHECR originate both from AGN, with the fraction $f_A$, and from the isotropic background. We treat all AGN as equal sources of UHECR, and introduce the smearing angle $\theta_s$ to incorporate the effects of intervening magnetic fields. We compare the arrival direction distributions observed by PAO and expected from the model by the correlational angular distance distribution (CADD) and the flux-exposure value distribution (FEVD) methods. Both CADD and FEVD methods rule out the AGN dominance model with a small smearing angle ($f_A \gtrsim 0.7$ and $\theta_s \lesssim 6^\circ$). Concerning the isotropy, CADD shows that the distribution of PAO data is marginally consistent with isotropy. The best fit model lies around the AGN fraction $f_A = 0.4$ and the moderate smearing angle $\theta_s = 10^\circ$. For the fiducial value $f_A = 0.7$, the best probability of CADD was obtained at a rather large smearing angle $\theta_s = 46^\circ$. Our results imply that for the whole set of AGN to be viable sources of UHECR, either an appreciable amount of additional isotropic background or a large smearing effect is required. We therefore bin the distance range of AGN to narrow down the UHECR sources, and find that the AGN residing in the distance range 60-80 Mpc have good correlation with the updated PAO data. This indicates that further study of subclasses of AGN as UHECR sources may be quite interesting.
I. INTRODUCTION



The recent confirmation of the Greisen-Zatsepin-Kuzmin (GZK) suppression in the cosmic ray energy spectrum [1, 2] indicates that the ultra-high energy cosmic rays (UHECR) with energies above the GZK cutoff, $E_{\rm GZK} \sim 40$ EeV (1 EeV = $10^{18}$ eV), mostly come from relatively close (within the GZK radius, $r_{\rm GZK} \sim 100$ Mpc) extragalactic sources. However, the identification of the UHECR sources is far from clear. Recent efforts to identify the sources are based on the belief that the intergalactic magnetic fields are not strong enough to alter significantly the trajectories of UHECR at these highest energies, so that the arrival directions of UHECR keep some correlation with the source distribution. An important step in this direction is to check the correlation between the UHECR arrival directions and the large scale structures manifested in the galaxy distribution. This was studied by several groups [6-10], and the results are not yet conclusive. A positive result would provide the basis for further study of correlations between UHECR and specific classes of astrophysical objects. Another important advance in this direction was the correlation between arrival directions of UHECR and nearby active galactic nuclei (AGN) reported by the Pierre Auger Observatory (PAO) [3]. Though further analysis with more data weakened the significance of the correlation [4, 5], it remains an important issue.

The correlation between the UHECR arrival directions and the astrophysical objects has been studied in many ways. We have to rely on statistical methods for two reasons: the poor understanding of the intergalactic magnetic fields makes the exact identification of the source of each UHECR difficult, and the number of observed UHECR events is smaller than that of the astrophysical objects which are candidate sources. In our previous paper [15], we developed new statistical test methods based on the previously used methods and combined them to estimate the significance of correlation reliably. The basic idea is that we reduce the two-dimensional distribution of arrival directions to one-dimensional probability distributions, which can be compared by using the well-known Kolmogorov-Smirnov (KS) test. We proposed a few reduced one-dimensional distributions suitable for the test of correlation between the UHECR arrival directions and the point sources of UHECR, which will be restated in detail in Sec. III. To make the statistical test more definite, we again use the simple AGN model for the UHECR sources. This model assumes that a fraction of UHECR above a certain energy cutoff originate from the AGN lying within a certain distance cut. The remaining fraction is the isotropic component accounting for the contribution from the sources lying outside of the distance cut. The model also assumes, for simplicity, that all selected AGN have equal luminosity and the same smearing angle of UHECR. For this simple AGN model for UHECR sources, our test method showed that the correlation between UHECR in the PAO data released in 2007 and AGN listed in the 12th edition of the Veron-Cetty and Veron (VCV) catalog is much stronger than expected for a simple isotropic distribution of UHECR, but also that the correlation is not strong enough to support the hypothesis that UHECR originate entirely from AGN.

In this paper, we revisit this analysis for two reasons. Firstly, updated data sets have appeared both for UHECR and for AGN. We use the updated AGN data listed in the 13th edition of the VCV catalog [19] and the updated UHECR data reported by PAO in 2010 [5]. The 13th edition of the VCV catalog, published in 2010, is a compilation of all known AGN from a variety of catalogs, which contains 133,336 quasars, 1,374 BL Lac objects, and 34,231 active galaxies, making a total of 168,941. In particular, the number of objects lying within the GZK radius ($\sim 100$ Mpc), which are used for the test of correlation with UHECR, is 862, which is larger by about 200 than that of the previous version of the catalog. PAO also published the updated data set in 2010. They released 69 UHECR events collected by the surface detector from 2004 January 1 to 2009 December 31. The data have energies above 55 EeV and zenith angles within $60^\circ$. The energy threshold is changed because PAO refined the reconstruction algorithms; however, the updated data include all previous UHECR events listed in the previous paper. Secondly and more importantly, in the previous paper the significance estimation in the statistical test was done in an incorrect way, which resulted in too strong constraints on the simple AGN model. Now, we perform Monte-Carlo simulations to get the correct significance estimates. This results in a significant change in the conclusion concerning the isotropy of UHECR events.

This paper is organized as follows. In Section II, we describe the simple AGN model for the UHECR sources and the details needed for the generation of Monte-Carlo events for the model and the statistical comparison with the observed data. In Section III we explain in detail our statistical methods for comparing two distributions of arrival directions. The results of our correlation analysis are presented in Section IV, and discussion and conclusion follow in Section V.



II. THE SIMPLE AGN MODEL FOR UHECR SOURCES 

We examine the plausibility of the idea that AGN are the main sources of UHECR through 
the statistical comparison of the arrival direction distribution of observed UHECR data and 
that expected from the AGN source model. To make the implications and the limitations 
of our analysis more definite, we need to clearly state the AGN model for UHECR sources. 
In this section, we describe the details of the simple AGN model for UHECR sources which 
is adopted for the correlation test of AGN and UHECR in this paper. 

For the comparison with the observations, we use the UHECR data with energies higher than a certain energy cut $E_c$. We take $E_c$ to be higher than the GZK cutoff, $E_{\rm GZK} \sim 40$ EeV. The advantages of using a high value of the energy cut $E_c$ for UHECR data are that we can minimize the deflection due to the intergalactic magnetic fields and that we can reduce the isotropic background contribution. At very high energies, most of the isotropic background contribution must come from far distant astrophysical objects. By taking the energy cut above the GZK cutoff, we can restrict most of the possible sources to be within the GZK radius, which is around 100 Mpc, and reduce this contribution. Of course, the disadvantage is that the number of UHECR data becomes small, which reduces the statistical power, so a compromise between the two is needed. We use the UHECR data released by PAO in 2010 [5]. The released data set contains 69 UHECR with energy higher than 55 EeV. To fully use the released data, we take the energy cut $E_c = 55$ EeV.



For our analysis, we use AGN listed in the 13th edition of the VCV catalog [19]. We select AGN within a certain distance cut $d_c$. Because the energy cut applied to the UHECR data is higher than the GZK cutoff, most of their probable sources are expected to lie within the GZK radius, $r_{\rm GZK} \sim 100$ Mpc. Thus, we take $d_c$ to be 100 Mpc (corresponding to the redshift $z < 0.024$). The original number of AGN within 100 Mpc in the VCV catalog is 865. This includes 3 AGN with zero redshift, for which no meaningful distance can be assigned. Thus, we eliminate these three AGN from our AGN data set, and the remaining 862 AGN are used in our analysis. Figure 1 shows the distributions of UHECR and AGN used in our analysis.

We consider AGN as smeared point sources of UHECR, incorporating the fact that the trajectories of UHECR can be bent by intervening magnetic fields. The smearing effect varies from AGN to AGN in general. We assume that each AGN has a Gaussian flux distribution





FIG. 1. Distribution of the arrival directions of UHECR, represented by black dots (•), with energy $E > 55$ EeV reported by PAO in 2010, in equatorial coordinates plotted using the Hammer projection. The solid red line represents the boundary of the sky covered by the PAO experiment. The blue crosses (x) represent the locations of AGN with distance $d < 100$ Mpc taken from the 13th edition of the VCV catalog. The cyan square (■) and the cyan triangle (▲) show the locations of Centaurus A and Messier 87, respectively.

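with a certain angular width. Then the UHECR flux from all AGN is given by

$$ F_{\rm AGN}(\hat{r}) = \sum_j \frac{L_j}{4\pi d_j^2} \, \frac{\exp\left[-\left(\theta_j(\hat{r})/\theta_{sj}\right)^2\right]}{N(\theta_{sj})}, \tag{1} $$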
where $L_j$ is the UHECR luminosity, $d_j$ is the distance, $\theta_j(\hat{r}) = \cos^{-1}(\hat{r} \cdot \hat{r}'_j)$ is the angle between the direction $\hat{r}$ and the $j$-th AGN, $\theta_{sj}$ is the smearing angle of the $j$-th AGN, and $N(\theta_{sj}) = \int d\Omega \, \exp[-(\theta_j(\hat{r})/\theta_{sj})^2]$ is the normalization of the smearing function. For small $\theta_s$, $N(\theta_s) \approx \pi\theta_s^2$, and for large $\theta_s$, $N(\theta_s) \approx 4\pi$. Just for simplicity, we assume that all AGN have the same UHECR luminosity, $L_j = L$, and the same smearing angle, $\theta_{sj} = \theta_s$. The value of $L$ will be fixed by the total number of UHECR contributed by AGN. The smearing angle, $\theta_s$, is taken to be a free parameter, while its fiducial value is taken to be $6^\circ$ [7].
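The two limits of the normalization quoted above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `smearing_norm` and the midpoint-rule integration are our own illustrative choices, and the azimuthal symmetry of the smearing function is used to reduce the solid-angle integral to one dimension.

```python
import math

def smearing_norm(theta_s, n_steps=20000):
    """N(theta_s): integral over the sphere of exp[-(theta/theta_s)^2].

    By azimuthal symmetry this equals
    2*pi * int_0^pi exp[-(theta/theta_s)^2] sin(theta) dtheta,
    evaluated here with the midpoint rule. theta_s is in radians.
    """
    h = math.pi / n_steps
    total = 0.0
    for k in range(n_steps):
        theta = (k + 0.5) * h
        total += math.exp(-(theta / theta_s) ** 2) * math.sin(theta)
    return 2.0 * math.pi * total * h

theta_small = math.radians(6.0)
# Ratio to the small-angle limit pi * theta_s^2 (close to 1 for theta_s = 6 deg).
print(smearing_norm(theta_small) / (math.pi * theta_small ** 2))
# Ratio to the large-angle limit 4*pi (close to 1 for a very large theta_s).
print(smearing_norm(100.0) / (4.0 * math.pi))
```

For the fiducial $\theta_s = 6^\circ$ the small-angle form is accurate to a fraction of a percent, so either expression could be used in practice.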

The UHECR with energy above the energy cut $E_c = 55$ EeV can still come from sources lying outside the distance cut $d_c = 100$ Mpc, and we want to take this into account in the UHECR source model. We consider that a certain fraction of UHECR with energy above $E_c$ originates from the AGN within the distance $d_c$, while the remaining fraction comes from the isotropically distributed background. Thus, the expected flux at a given arrival direction $\hat{r}$ is the sum of two contributions,

$$ F(\hat{r}) = F_{\rm AGN}(\hat{r}) + F_{\rm ISO}. \tag{2} $$



Now we define the AGN fraction $f_A$ to be

$$ f_A = \frac{\bar{F}_{\rm AGN}}{\bar{F}_{\rm AGN} + F_{\rm ISO}}, \tag{3} $$

where $\bar{F}_{\rm AGN} = (4\pi)^{-1} \int F_{\rm AGN}(\hat{r}) \, d\Omega = L \, (4\pi)^{-2} \sum_j d_j^{-2}$ is the average AGN-contributed flux. Note that this definition of the AGN fraction is somewhat different from that in our previous work. There, the AGN fraction was defined to be the fraction of AGN-originated UHECR after considering the exposure of the detector array. This actual fraction of AGN contribution at a given detector is generally different from $f_A$ because it depends on the location of the detector relative to the source distribution and on the size of the smearing angle. We found that for the PAO site, considering the exposure makes the actual AGN fraction slightly smaller than $f_A$. Now the UHECR flux can be written as

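$$ F(\hat{r}) = f_A \bar{F} \, \frac{4\pi \sum_j d_j^{-2} \exp\left[-\left(\theta_j(\hat{r})/\theta_s\right)^2\right]}{N(\theta_s) \sum_j d_j^{-2}} + (1 - f_A) \bar{F}, \tag{4} $$

where $\bar{F} = \bar{F}_{\rm AGN} + F_{\rm ISO}$. Out of the three parameters $L$, $\theta_s$, and $F_{\rm ISO}$, the AGN fraction $f_A$ and the smearing angle $\theta_s$ are treated as the free parameters of the model, while the average flux $\bar{F}$ is fixed by the total number of UHECR events.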
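As an illustration of how the flux of Eq. (4) can be evaluated at a given direction, the sketch below computes the ratio $F(\hat{r})/\bar{F}$ for a list of sources. It is a hedged example rather than the code used for this paper: the names `angular_distance` and `relative_flux`, the `(ra, dec, distance)` tuple layout, and the toy source list are assumptions made for illustration, and the normalization $N(\theta_s)$ is passed in by the caller.

```python
import math

def angular_distance(ra1, dec1, ra2, dec2):
    """Great-circle angle (radians) between two directions given as (RA, Dec) in radians."""
    c = (math.sin(dec1) * math.sin(dec2)
         + math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2))
    return math.acos(max(-1.0, min(1.0, c)))

def relative_flux(ra, dec, sources, f_agn, theta_s, norm):
    """F(r)/F_bar following Eq. (4).

    sources : list of (ra, dec, distance) tuples, angles in radians
    f_agn   : AGN fraction f_A
    theta_s : common smearing angle in radians
    norm    : N(theta_s), the normalization of the smearing function
    """
    weight_sum = sum(d ** -2 for _, _, d in sources)
    smeared = sum(
        d ** -2 * math.exp(-(angular_distance(ra, dec, s_ra, s_dec) / theta_s) ** 2)
        for s_ra, s_dec, d in sources
    )
    return f_agn * 4.0 * math.pi * smeared / (norm * weight_sum) + (1.0 - f_agn)

# Toy example with two hypothetical sources (not catalog entries).
toy_sources = [(0.0, 0.0, 10.0), (1.0, 0.5, 20.0)]
theta_s = math.radians(6.0)
norm = math.pi * theta_s ** 2  # small-angle approximation of N(theta_s)
print(relative_flux(0.0, 0.0, toy_sources, 0.7, theta_s, norm))   # on top of a source
print(relative_flux(3.0, -1.0, toy_sources, 0.7, theta_s, norm))  # far from both sources
```

Far from every source the ratio tends to $1 - f_A$, the isotropic floor, while for $f_A = 0$ it equals 1 everywhere.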

If the source distribution is known, the fraction of UHECR with $E > E_c$ coming from the sources with $d < d_c$ can be estimated as a function of $E_c$ and $d_c$ by solving the cosmic ray propagation equation. For a uniform distribution of equal sources, $E_c = 55$ EeV and $d_c = 100$ Mpc, the estimated value is around $f_A \sim 0.7$ [8], and we take this value as the fiducial value of $f_A$.

For the correct comparison of observed arrival directions with the expected ones, we also need to take into account the efficiency of the detector as a function of the arrival direction, which depends on the location and the characteristics of the detector array. For UHECR with energies above the GZK cutoff, considering the geometric efficiency only is good enough. It is determined by the location of the detector array and the zenith angle cut. Then the exposure function $h$ is a function of the declination $\delta$ only [20],

$$ h(\delta) = \frac{1}{\pi} \left[ \sin\alpha_m \cos\lambda \cos\delta + \alpha_m \sin\lambda \sin\delta \right], \tag{5} $$

where $\lambda$ is the latitude of the detector array, $\theta_m$ is the zenith angle cut, and

$$ \alpha_m = \begin{cases} 0, & \text{for } \xi > 1, \\ \pi, & \text{for } \xi < -1, \\ \cos^{-1}\xi, & \text{otherwise}, \end{cases} \qquad \text{with} \quad \xi = \frac{\cos\theta_m - \sin\lambda \sin\delta}{\cos\lambda \cos\delta}. $$
FIG. 2. Distributions of the mock UHECR arrival directions (6900 events, represented by small red dots) of the PAO experiment, obtained from the simple AGN model for two different values of the AGN fraction, $f_A = 1$ (upper panel) and $f_A = 0.4$ (lower panel), with the same smearing angle $\theta_s = 6^\circ$. Other symbols are the same as in Figure 1.



The latitude of the PAO site is $\lambda = -35.20^\circ$ and the zenith angle cut of the released data is $\theta_m = 60^\circ$.
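The exposure function of Eq. (5) is straightforward to code. The sketch below is illustrative only: the function name `exposure` and the explicit handling of the celestial poles, where $\xi$ is formally undefined, are our own choices, with the PAO latitude and zenith angle cut used as defaults.

```python
import math

def exposure(dec, lat=math.radians(-35.20), theta_m=math.radians(60.0)):
    """Relative geometric exposure h(delta) of Eq. (5); all angles in radians."""
    numer = math.cos(theta_m) - math.sin(lat) * math.sin(dec)
    denom = math.cos(lat) * math.cos(dec)
    if abs(denom) < 1e-12:
        # At a celestial pole xi diverges; only its sign matters for alpha_m.
        xi = math.copysign(math.inf, numer)
    else:
        xi = numer / denom
    if xi > 1.0:
        alpha_m = 0.0          # direction never rises above the zenith cut
    elif xi < -1.0:
        alpha_m = math.pi      # direction is always within the zenith cut
    else:
        alpha_m = math.acos(xi)
    return (math.sin(alpha_m) * math.cos(lat) * math.cos(dec)
            + alpha_m * math.sin(lat) * math.sin(dec)) / math.pi

for dec_deg in (-90.0, -35.0, 0.0, 24.0, 30.0):
    print(dec_deg, exposure(math.radians(dec_deg)))
```

With these defaults the exposure vanishes for declinations above $\lambda + \theta_m = 24.8^\circ$, which is the northern boundary of the PAO field of view.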



To get the expected distribution from the simple AGN model, we rely on simulation, taking into account the exposure function. In Figure 2, we show the distributions of mock UHECR data for two different values of the AGN fraction, $f_A = 1.0$ and $f_A = 0.4$, with the same smearing angle $\theta_s = 6^\circ$.
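One possible way to generate such mock events is sketched below. This is an assumption-laden illustration, not the simulation code used for this paper: with probability $f_A$ an AGN is chosen with weight $d_j^{-2}$ and the direction is smeared with the Gaussian profile, otherwise an isotropic direction is drawn, and the candidate is then accepted with probability proportional to the exposure. All function names, the `(ra, dec, distance)` source layout, and the `h_max` argument (an upper bound on the exposure) are hypothetical.

```python
import math
import random

def to_vector(ra, dec):
    """Unit vector for equatorial coordinates (radians)."""
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))

def to_radec(v):
    """Equatorial coordinates (radians) of a unit vector, with RA in [0, 2*pi)."""
    x, y, z = v
    return math.atan2(y, x) % (2.0 * math.pi), math.asin(max(-1.0, min(1.0, z)))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def sample_offset(theta_s, rng):
    """Draw theta with density ~ exp[-(theta/theta_s)^2] sin(theta) on [0, pi]."""
    while True:
        # Proposal density ~ theta * exp[-(theta/theta_s)^2], sampled by inverting its CDF.
        theta = theta_s * math.sqrt(-math.log(1.0 - rng.random()))
        if theta == 0.0:
            return 0.0
        # Accept with probability sin(theta)/theta; proposals beyond pi are rejected.
        if theta <= math.pi and rng.random() < math.sin(theta) / theta:
            return theta

def smear(ra_s, dec_s, theta_s, rng):
    """Direction offset from the source by a Gaussian-smeared angle and a random azimuth."""
    s = to_vector(ra_s, dec_s)
    helper = (0.0, 0.0, 1.0) if abs(s[2]) < 0.9 else (1.0, 0.0, 0.0)
    e1 = cross(helper, s)
    norm = math.sqrt(sum(c * c for c in e1))
    e1 = tuple(c / norm for c in e1)
    e2 = cross(s, e1)  # s, e1, e2 form an orthonormal basis
    theta = sample_offset(theta_s, rng)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    v = tuple(math.cos(theta) * s[i]
              + math.sin(theta) * (math.cos(phi) * e1[i] + math.sin(phi) * e2[i])
              for i in range(3))
    return to_radec(v)

def mock_events(n, sources, f_agn, theta_s, exposure, h_max, rng):
    """Generate n mock arrival directions (ra, dec) for the simple AGN model."""
    weights = [d ** -2 for _, _, d in sources]
    events = []
    while len(events) < n:
        if rng.random() < f_agn:
            ra_s, dec_s, _ = rng.choices(sources, weights=weights)[0]
            ra, dec = smear(ra_s, dec_s, theta_s, rng)
        else:
            ra, dec = rng.uniform(0.0, 2.0 * math.pi), math.asin(rng.uniform(-1.0, 1.0))
        if rng.random() * h_max < exposure(dec):  # rejection step for the exposure
            events.append((ra, dec))
    return events

rng = random.Random(0)
toy_sources = [(1.0, -0.5, 10.0), (4.0, -1.0, 50.0)]  # hypothetical sources
sample = mock_events(5, toy_sources, 0.4, math.radians(6.0), lambda dec: 1.0, 1.0, rng)
print(sample)
```

Because the exposure rejection acts after the choice between the AGN and isotropic components, the fraction of accepted AGN events generally differs from $f_A$, in line with the remark above on the actual AGN fraction at a given detector.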



III. STATISTICAL COMPARISON OF TWO ARRIVAL DIRECTION DISTRIBUTIONS

We now describe our statistical methods to measure the plausibility of the UHECR source model. What we obtain through statistical analysis is the probability that the observed UHECR arrival direction distribution originates from the given UHECR source model. This is achieved by statistically quantifying how similar the observed UHECR arrival direction distribution is to the one expected from the source model. Correlation studies for this kind of point distribution have been done in many branches of science, and the statistical analysis methods for spatial point patterns are well established [21-24]. One of the most useful methods to compare point patterns is Ripley's K and L functions. The underlying concept is that we can characterize the distributions by counting the mean number of points of type 1 in a disc of radius $r$ centered at a typical point of type 2. We obtain Ripley's K function as a function of $r$, and then compare the function obtained from the observed distribution with the theoretically expected one. The comparison method used by PAO [4, 5] is similar to this method. They count the number of events within the given angular distance obtained by their exploratory scan. However, this approach cannot avoid the arbitrariness of fixing the radius $r$. For the appropriate application of Ripley's K function, one should count the number of events for all radii $r$. Therefore, we tried to develop methods which share the same basic idea, but are simpler and more intuitive for spherical data masked by the exposure function. In our previous work [15], we developed comparison methods in which the two-dimensional UHECR arrival direction distribution on the sphere is reduced to a one-dimensional probability distribution of some sort, so that two distributions can be compared by using the standard Kolmogorov-Smirnov (KS) test or its variants such as the Anderson-Darling (AD) test and the Kuiper (KP) test. In this section we elaborate further on these methods.

As an illustration of our method, let us consider the distribution of the equatorial coordinates, Right Ascension (RA) or Declination (DEC), of UHECR. Let us call them the RA distribution (RAD) and the DEC distribution (DECD), respectively. In this case, the reduction is simply $\hat{r}_i = (\alpha_i, \delta_i) \to \alpha_i$ for RAD and $\hat{r}_i = (\alpha_i, \delta_i) \to \delta_i$ for DECD, where $\hat{r}_i$ are the arrival directions of UHECR. In Figure 3 we show RAD and DECD of the PAO data and compare them with those of the isotropic distribution and of the simple AGN model with fiducial parameters
FIG. 3. RA (left panel) and DEC (right panel) distributions of the PAO data, compared to those of the isotropic distribution and the simple AGN model with $f_A = 0.7$ and $\theta_s = 6^\circ$.



$f_A = 0.7$ and $\theta_s = 6^\circ$ described in the previous section. RAD and DECD are normalized as probability distributions by dividing the counts by the total number of data, so that they sum up to 1.

Now we can apply statistical tests for one-dimensional distributions, such as the KS test, the AD test, and the KP test, to measure how similar the distribution obtained from the observed data is to that expected from the model. All three tests are based on the cumulative probability distribution (CPD), $S_N(x) = \int^x p(x') \, dx'$. Though we made bins for plotting the distributions in Figure 3, these tests do not involve binning, because they use the CPD. Each test defines its own statistic. The KS statistic $D_{\rm KS}$ is the maximum difference between the CPD of the observed distribution $S_{N_1}(x)$ and the CPD of the expected distribution $S_{N_2}(x)$,

$$ D_{\rm KS} = \max_x \left| S_{N_1}(x) - S_{N_2}(x) \right|. \tag{6} $$

The AD statistic $D_{\rm AD}$ is the weighted statistic

$$ D_{\rm AD} = \max_x \frac{\left| S_{N_1}(x) - S_{N_2}(x) \right|}{\sqrt{S_{N_2}(x) \left( 1 - S_{N_2}(x) \right)}}. \tag{7} $$

The KP statistic $D_{\rm KP}$ is the sum of the maximum differences of the observed distribution above and below the expected distribution,

$$ D_{\rm KP} = \max_x \left[ S_{N_1}(x) - S_{N_2}(x) \right] + \max_x \left[ S_{N_2}(x) - S_{N_1}(x) \right]. \tag{8} $$

The probability that the observed data are obtained from the model under consideration is estimated from the significance level of the statistic. The significance level of the KS statistic $D_{\rm KS}$ is given approximately by the formula [25]

$$ P_{\rm KS}(D_{\rm KS} | N_e) = Q_{\rm KS}\left( \left[ \sqrt{N_e} + 0.12 + 0.11/\sqrt{N_e} \right] D_{\rm KS} \right), \tag{9} $$

where $Q_{\rm KS}(\lambda) = 2 \sum_{j=1}^{\infty} (-1)^{j-1} e^{-2 j^2 \lambda^2}$ and $N_e = N_1 N_2 / (N_1 + N_2)$ is the effective number of data. For the KP statistic $D_{\rm KP}$, a similar approximate formula is also available,

$$ P_{\rm KP}(D_{\rm KP} | N_e) = Q_{\rm KP}\left( \left[ \sqrt{N_e} + 0.155 + 0.24/\sqrt{N_e} \right] D_{\rm KP} \right), \tag{10} $$

where $Q_{\rm KP}(\lambda) = 2 \sum_{j=1}^{\infty} (4 j^2 \lambda^2 - 1) e^{-2 j^2 \lambda^2}$. For the AD statistic $D_{\rm AD}$, there is no known simple formula analogous to Eqs. (9) and (10). We need to rely on Monte-Carlo simulations to get the significance level of the AD statistic. The three test methods have their own pros and cons. It is known, in general, that the KS statistic is sensitive around the median, the AD statistic is sensitive on the tails, and the KP statistic has equal sensitivity at all values of $x$.
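The KS and KP statistics of Eqs. (6) and (8) and the approximate significance levels of Eqs. (9) and (10) can be sketched as follows. This is a minimal illustration under our own naming (`ks_kuiper_statistics`, `q_ks`, `q_kp`, `p_ks`, `p_kp`, `effective_number` are not from the paper); the series are truncated at a finite number of terms, and the small-argument cutoffs that return 1 are numerical guards we assume because the truncated series are unreliable there.

```python
import bisect
import math

def ks_kuiper_statistics(observed, reference):
    """Return (D_KS, D_KP) between the empirical CPDs of two samples."""
    obs = sorted(observed)
    ref = sorted(reference)
    d_plus = d_minus = 0.0
    # Both CPDs are step functions, so it suffices to look at the data points.
    for x in set(obs) | set(ref):
        s1 = bisect.bisect_right(obs, x) / len(obs)
        s2 = bisect.bisect_right(ref, x) / len(ref)
        d_plus = max(d_plus, s1 - s2)
        d_minus = max(d_minus, s2 - s1)
    return max(d_plus, d_minus), d_plus + d_minus

def q_ks(lam, terms=100):
    """Q_KS(lambda) of Eq. (9), truncated series."""
    if lam < 0.1:  # guard: the truncated alternating series fails here; Q is ~1
        return 1.0
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

def q_kp(lam, terms=100):
    """Q_KP(lambda) of Eq. (10), truncated series."""
    if lam < 0.4:  # guard: Q_KP is ~1 for small arguments
        return 1.0
    total = sum((4.0 * j * j * lam * lam - 1.0) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

def effective_number(n1, n2):
    return n1 * n2 / (n1 + n2)

def p_ks(d_ks, n_e):
    root = math.sqrt(n_e)
    return q_ks((root + 0.12 + 0.11 / root) * d_ks)

def p_kp(d_kp, n_e):
    root = math.sqrt(n_e)
    return q_kp((root + 0.155 + 0.24 / root) * d_kp)

# Effective number of data for 69 observed and 10^5 mock events.
n_e = effective_number(69, 10 ** 5)
print(n_e, p_ks(0.15, n_e), p_kp(0.20, n_e))
```

With $10^5$ mock events the effective number is already very close to the number of observed events, so the statistical uncertainty of the expected distribution is negligible compared to that of the data.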

Let us apply the KS, KP, and AD tests to compare RAD and DECD of the PAO data with those obtained from the model under consideration. For RAD and DECD, the number of data in the distribution is the same as the number of UHECR data. Thus, $N_1 = N_O$, the number of observed UHECR data, and $N_2 = N_S$, the number of mock UHECR data. We can make the expected distribution more accurate by increasing the number of mock data $N_S$ from the model under consideration. In the limit $N_S \to \infty$, the effective number of data is simply $N_e = N_O$. For practical calculation, we set $N_S = 10^5$. The probabilities that RAD and DECD of the PAO data come from the isotropic distribution and from the simple AGN model with fiducial parameters ($f_A = 0.7$, $\theta_s = 6^\circ$) are listed in Table I. One notable thing is that the AD test gives much smaller probabilities than the other comparison methods for the RAD method. However, we found that this is an artifact caused by the fact that one of the PAO events happens to have RA = $0^\circ$, the end point of the RA range, and the AD test is very sensitive on the tails. This makes $D_{\rm AD}$ very large and the probability very small. Note that RA is actually a cyclic variable on a circle. This artifact can be avoided by shifting
| Model | Reduction method | KS | KP | AD |
|---|---|---|---|---|
| Isotropic distribution ($f_A = 0$) | RAD ($0^\circ$) | 0.52 | 0.33 | $< 10^{-5}$ |
| | RAD ($60^\circ$) | 0.65 | 0.33 | 0.93 |
| | RAD ($180^\circ$) | 0.10 | 0.33 | 0.18 |
| | DECD | 0.57 | 0.39 | 0.61 |
| AGN model ($f_A = 0.7$, $\theta_s = 6^\circ$) | RAD ($0^\circ$) | 0.088 | 0.096 | $< 10^{-5}$ |
| | RAD ($60^\circ$) | 0.22 | 0.095 | 0.23 |
| | RAD ($180^\circ$) | 0.088 | 0.095 | 0.27 |
| | DECD | 0.27 | 0.013 | 0.44 |

TABLE I. The probabilities that RAD and DECD of the PAO data come from the isotropic distribution and from the simple AGN model with fiducial parameters $f_A = 0.7$ and $\theta_s = 6^\circ$, for the three comparison methods (KS, KP, and AD tests). The angles inside the parentheses are the shift angles of the RAD end point.



the origin of the RA coordinate by an arbitrary amount. In fact, the KS and AD tests are not invariant under this shift, so their results depend on the amount of shift, whereas the KP test is invariant under this cyclic shift. Thus, for RAD, the KP test is the right choice rather than the KS or AD test. Overall, both RAD and DECD methods indicate that the PAO data are consistent with isotropy, while both methods with the KP test reveal that the simple AGN model with fiducial parameters is disfavored.
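The different behavior of the KS and KP statistics under a cyclic shift of the RA origin can be seen in a small numerical experiment. The sketch below is a simplified, one-sample variant and not the procedure used for Table I: it compares a toy list of right ascensions (invented for illustration) with the analytic uniform CPD, which is what isotropy implies for RA because the exposure depends on the declination only. The function name `ks_kp_uniform` is hypothetical.

```python
def ks_kp_uniform(ra_deg, shift_deg=0.0):
    """Return (D_KS, D_KP) of RA values against a uniform CPD on [0, 360).

    The RA origin is moved by shift_deg before the comparison.
    """
    xs = sorted(((ra - shift_deg) % 360.0) / 360.0 for ra in ra_deg)
    n = len(xs)
    # Largest excess of the empirical CPD above / below the uniform CPD.
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus), d_plus + d_minus

toy_ra = [10.0, 20.0, 30.0, 40.0]  # toy values, not PAO data
for shift in (0.0, 60.0, 180.0):
    d_ks, d_kp = ks_kp_uniform(toy_ra, shift)
    print(shift, round(d_ks, 4), round(d_kp, 4))
```

For this toy sample the KS statistic changes with the shift while the KP statistic stays the same, which is the property that makes the KP test the natural choice for a cyclic variable.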

The reduction from the two-dimensional distribution to a one-dimensional distribution inevitably implies a loss of information contained in the data. However, the reduced distributions are easy and conceptually transparent to compare, and sometimes a good choice of reduction method can reveal what causes the discrepancy between the observed data and the model prediction. RAD or DECD may be good for checking isotropy, but may not be suitable for the study of correlation between the UHECR arrival directions and the directions of astrophysical point sources such as AGN. We can devise reduced distributions which are more sensitive to the correlation between the sources and UHECR. In the previous paper, we introduced three methods: AADD, CADD, and FEVD. Now we focus on CADD and FEVD described below, which deal with the correlation between AGN and UHECR directly and thus are more relevant in correlation analysis.
a. Correlational Angular Distance Distribution (CADD) This is the distribution of the angular distances of all pairs of UHECR arrival directions and point source directions:

$$ \text{CADD}: \left\{ \theta_{ij} = \cos^{-1}(\hat{r}_i \cdot \hat{r}'_j) \mid i = 1, \dots, N; \ j = 1, \dots, M \right\}, \tag{11} $$

where $\hat{r}_i$ are the UHECR arrival directions, $\hat{r}'_j$ are the point source directions, and $N$ and $M$ are their total numbers, respectively. In Figure 4 we show the concept of CADD schematically. This is also an improvement on previously adopted methods [3, 4, 13, 14] and is most useful when we consider a set of point sources for UHECR. The number of data in CADD obtained from $N$ UHECR data is $N_{\rm CADD} = NM$. This means that the data in CADD are not all independently sampled, and the probability formulas (9) and (10), which assume independent sampling of data, cannot be used. Therefore, the probability has to be directly inferred for the source model in hand through Monte-Carlo simulations. For this purpose, we first form a reference set consisting of a huge number of UHECR events generated from the source model. Then, we generate a mock set consisting of the same number of UHECR events as the observed data from the model and calculate the KS statistic $D_{\rm KS}$ between the reference set and the mock set. We repeat the generation of the mock set enough times to get the probability distribution of $D_{\rm KS}$. In this way, we infer the significance of $D_{\rm KS,observed}$ between the reference set and the observed data.
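The Monte-Carlo procedure just described can be sketched in a few functions. This is an illustrative outline under stated assumptions, not the code used for the results below: `draw_mock` stands for any user-supplied generator of mock events from the source model (here a plain isotropic generator is used as a stand-in), the sources are given as `(ra, dec)` pairs in radians, and the reference and trial sizes are kept small so that the example runs quickly.

```python
import bisect
import math
import random

def angular_distance(ra1, dec1, ra2, dec2):
    """Great-circle angle (radians) between two directions."""
    c = (math.sin(dec1) * math.sin(dec2)
         + math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2))
    return math.acos(max(-1.0, min(1.0, c)))

def cadd(events, sources):
    """All N*M angular distances between events and sources, Eq. (11)."""
    return [angular_distance(ra, dec, s_ra, s_dec)
            for ra, dec in events for s_ra, s_dec in sources]

def ks_statistic(sample, reference_sorted):
    """D_KS between the empirical CPDs of a sample and a sorted reference sample."""
    sample_sorted = sorted(sample)
    d = 0.0
    for x in sample_sorted + reference_sorted:
        s1 = bisect.bisect_right(sample_sorted, x) / len(sample_sorted)
        s2 = bisect.bisect_right(reference_sorted, x) / len(reference_sorted)
        d = max(d, abs(s1 - s2))
    return d

def mc_probability(observed, sources, draw_mock, n_reference=1000, n_trials=200):
    """Return (D_KS of the observed data, fraction of mock sets with D_KS at least as large)."""
    reference = sorted(cadd(draw_mock(n_reference), sources))
    d_obs = ks_statistic(cadd(observed, sources), reference)
    exceed = 0
    for _ in range(n_trials):
        mock = draw_mock(len(observed))
        if ks_statistic(cadd(mock, sources), reference) >= d_obs:
            exceed += 1
    return d_obs, exceed / n_trials

def draw_isotropic(n, rng):
    """Stand-in source model: isotropic arrival directions on the full sky."""
    return [(rng.uniform(0.0, 2.0 * math.pi), math.asin(rng.uniform(-1.0, 1.0)))
            for _ in range(n)]

rng = random.Random(1)
toy_sources = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0)]  # hypothetical source directions
toy_observed = [(0.1, 0.0)] * 20                     # events piled up on one source
d_obs, prob = mc_probability(toy_observed, toy_sources,
                             lambda n: draw_isotropic(n, rng), 500, 50)
print(d_obs, prob)
```

In this toy case the events sit on top of the sources, so the observed statistic is far larger than any value obtained from the isotropic mock sets and the inferred probability is essentially zero. The resolution of the probability is limited by the number of trials, so probabilities as small as $10^{-5}$ require correspondingly many mock sets.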

b. Flux Exposure Value Distribution (FEVD) At a given arrival direction, the expected flux value is the product of the UHECR flux expected from the UHECR source model and the exposure function of the detector at that direction. FEVD is the distribution of the expected flux values at the UHECR arrival directions:

$$ \text{FEVD}: \left\{ F_i = F(\hat{r}_i) \, h(\hat{r}_i) \mid i = 1, \dots, N \right\}, \tag{12} $$

where $\hat{r}_i$ are the UHECR arrival directions, $N$ is the total number of UHECR, and $F(\hat{r}_i)$ and $h(\hat{r}_i)$ are the UHECR flux and the exposure function, respectively. In Figure 5 we show the concept of FEVD schematically. It was proposed by Koers and Tinyakov [8] to test the correlation between the galaxy distribution and the UHECR. One merit of this method is that it can be used for a continuous source distribution, as well as for point sources. The number of data in FEVD is $N_{\rm FEVD} = N$. Thus, the probability formulas (9) and (10) can be directly used. We confirmed this fact through Monte-Carlo simulations done in the same way as in the CADD case.
FIG. 4. Illustrations showing the basic idea of CADD and its comparison (taken from Figure 1 in [15]). (a) CADD is the probability distribution of all angular distances between the reference (point source) directions (red dots) and the UHECR arrival directions (black dots). (b) When the observed UHECR events are more clustered around the reference directions than, say, those of the isotropic distribution, the observed CADD has larger probability density at small angles than that expected from the isotropic distribution.

IV. CORRELATION ANALYSIS 



In our previous work [15], we analyzed the correlation between the PAO data released in 2007 and the AGN listed in the 12th edition of the VCV catalog. In this paper, we update the analysis using the PAO data released in 2010 and the 13th edition of the VCV catalog. For moderate numbers of data, the suitable methods for the analysis of correlation between the arrival directions of UHECR and the locations of point sources such as AGN are CADD and FEVD. In this paper, we mainly use CADD, with FEVD as a supplement to cross-check the results and compare the reduction methods. We also emphasize that we correct the previous probability calculation for CADD by using the values directly inferred through Monte-Carlo simulations.

To get the expected distribution from the simple AGN model, we also rely on simulation. Because we use the probability distributions for comparison, we can obtain a more
FIG. 5. Illustrations showing the basic idea of FEVD and its comparison. (a) FEVD is the probability distribution of the flux times exposure values at the UHECR arrival directions (black dots). (b) When the observed UHECR follow the flux predicted by the model more faithfully than, say, that of the isotropic distribution, the observed FEVD has larger probability density at high flux values than that expected from the isotropic distribution.

accurate expected distribution by increasing the number of mock UHECR data from the model. For practical computation, we use $N_S = 10^5$, which gives accuracy sufficient for our purpose within reasonable computation time.

To understand how the discrepancy between the distribution obtained from the data and that from the model occurs, it is helpful to examine the cumulative probability distribution (CPD) directly and check the position where the KS, KP, or AD statistic is obtained. In Figure 6 we show the CPD of CADD of the PAO data and of three cases of interest: the simple AGN model with $f_A = 0$ (completely isotropic distribution), with $f_A = 1$ and $\theta_s = 6^\circ$ (complete AGN origination with a small smearing angle), and with $f_A = 0.4$ and $\theta_s = 6^\circ$ (the best fit model for a smearing angle $\theta_s = 6^\circ$). We note that the stronger the correlation between UHECR and AGN is, the more UHECR lie at small angular distances from AGN, resulting in a steeper rise of the CPD at small angles. Thus, the small angle region of the CPD in Figure 6 reveals that the PAO data have stronger correlation with AGN than the completely isotropic distribution, but the correlation is not strong enough to be consistent with the case
FIG. 6. The cumulative probability distributions of CADD of AGN and UHECR. The solid black line is for the PAO data, the dashed green line for the isotropic distribution, and the dashed blue line and the dotted red line for the simple AGN model with the smearing angle $\theta_s = 6^\circ$ and the AGN fraction $f_A = 1$ and $f_A = 0.4$, respectively. The vertical bars show the positions and sizes of the KS statistic $D$. The numbers in the two right columns of the legend are the value of the KS statistic and the probability that the distribution of PAO data is obtained from the model specified.

of complete AGN origination with a small smearing angle ($\theta_s = 6^\circ$). The probabilities for these three cases obtained by the CADD method using the KS, KP, and AD tests are shown in Table II. All three tests give consistent results. Unlike RAD, the CADD and FEVD methods do not involve a cyclic variable, so the KS test does not suffer from the shift dependence found for RAD. Moreover, because the KS test is widely used in correlation studies of UHECR arrival directions, KS probabilities quoted in different papers can be compared directly without regard for differences between test methods. Hence, from now on we quote only the probabilities obtained by the KS test. Overall, the probabilities given by the CADD method indicate that the PAO data are marginally consistent with isotropy ($P = 0.11$) but rule out complete AGN origination with a small smearing angle ($P < 10^{-5}$).
| AGN model: $f_A$ | $\theta_s$ | $P_{\rm KS}$ | $P_{\rm KP}$ | $P_{\rm AD}$ |
|---|---|---|---|---|
| 0 (Isotropic) | | 0.11 | 0.036 | 0.11 |
| 0.4 | $6^\circ$ | 0.57 | 0.52 | 0.76 |
| 1 | $6^\circ$ | $< 10^{-5}$ | $< 10^{-5}$ | 0.0040 |

TABLE II. Probabilities that the CADD of the PAO data is obtained from the given models, estimated by three different comparison methods, the KS, KP, and AD tests.
FIG. 7. AGN fraction ($f_A$) and smearing angle ($\theta_s$) dependence of probabilities by the CADD (left panel) and FEVD (right panel) methods for the PAO data. The grey scale represents the sigma level of each case.



FIG. 8. AGN fraction (f_A) dependence of probabilities at a smearing angle θ_s = 6° by CADD and 
FEVD methods. 

Because the CPD of CADD shows that the observed PAO distribution lies between the 
isotropic distribution and complete AGN origination with a small smearing angle, we expect 
that decreasing the AGN fraction f_A (that is, adding more isotropic component) or increasing 
the smearing angle θ_s may improve the probability. In Figure 7 we show the f_A and θ_s 
dependence of the probabilities by the CADD and FEVD methods for the PAO data. Both CADD 
and FEVD methods rule out AGN dominance with small smearing angles (f_A ≳ 0.7 and 
θ_s ≲ 6°, the lower right corner of the plot). FEVD gives the stronger constraint for small 
smearing angles, and it strongly disfavors AGN dominance even for moderately large smearing 
angles. However, for large smearing angles and small AGN fractions, where the distribution 
tends to be isotropic, FEVD becomes rather insensitive and CADD gives the stronger constraint. 
For AGN dominance (f_A > 0.7) to be compatible with the PAO data, a rather large smearing 
angle (θ_s > 30°) is required. Both the CADD and FEVD methods find the PAO data to be 
marginally consistent with isotropy, even though complete isotropy is not the best fit to the 
PAO data in either method. 
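
To make the roles of the two parameters concrete, the following Python sketch draws mock arrival directions from a model of this kind. It is a sketch under several assumptions that are ours and not taken from the analysis itself: each event comes from an AGN with probability f_A (chosen with weight 1/d^2, as for equal-luminosity sources) and is otherwise isotropic, the smearing is modeled as a two-dimensional Gaussian offset of scale θ_s, the detector exposure is ignored, and the two-source catalog is hypothetical.

```python
import math
import random

def isotropic_direction(rng):
    """Draw a unit vector uniformly distributed on the sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(max(0.0, 1.0 - z * z))
    return (r * math.cos(phi), r * math.sin(phi), z)

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def smear(direction, theta_s_deg, rng):
    """Deflect a unit vector by a random offset of angular scale theta_s."""
    sigma = math.radians(theta_s_deg)
    # Assumed kernel: 2D Gaussian offset, so the polar angle is Rayleigh distributed
    # (clipped at pi; only a rough approximation for very large theta_s).
    alpha = min(sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random())), math.pi)
    beta = rng.uniform(0.0, 2.0 * math.pi)
    # Orthonormal basis (e1, e2) of the plane perpendicular to `direction`.
    helper = (1.0, 0.0, 0.0) if abs(direction[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = cross(direction, helper)
    norm = math.sqrt(sum(c * c for c in e1))
    e1 = tuple(c / norm for c in e1)
    e2 = cross(direction, e1)
    return tuple(math.cos(alpha) * d
                 + math.sin(alpha) * (math.cos(beta) * a + math.sin(beta) * b)
                 for d, a, b in zip(direction, e1, e2))

def mock_events(agn, f_agn, theta_s_deg, n_events, rng):
    """Mock arrival directions; `agn` is a list of (unit_vector, distance_mpc)."""
    weights = [1.0 / (dist * dist) for _vec, dist in agn]
    events = []
    for _ in range(n_events):
        if rng.random() < f_agn:
            source, _dist = rng.choices(agn, weights=weights, k=1)[0]
            events.append(smear(source, theta_s_deg, rng))
        else:
            events.append(isotropic_direction(rng))
    return events

# Hypothetical two-source catalog, with f_A = 0.7 and theta_s = 6 degrees.
toy_agn = [((1.0, 0.0, 0.0), 4.0), ((0.0, 0.0, 1.0), 16.0)]
sample = mock_events(toy_agn, 0.7, 6.0, 69, random.Random(0))
print(len(sample), "mock events generated")
```

Lowering f_A or raising θ_s in such a generator both push the mock sky toward isotropy, which is why the two parameters are partly degenerate in Figure 7.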

Now, let us examine the results for the fiducial values of the two parameters. In Figure 8 
we plot the AGN fraction (f_A) dependence of the probability for the fiducial value of the 
smearing angle, θ_s = 6°. The probability by CADD reaches its maximum at f_A = 0.37, and 
that by FEVD at f_A = 0.16. For this small value of the smearing angle, FEVD gives stronger 
constraints than CADD, and consistency with the PAO data requires a low AGN fraction 
(f_A ≲ 0.4) and a large isotropic background. Compared to the result for the 2007 PAO data, 
the best-fit value of f_A is reduced by 0.1. In Figure 9 we plot the smearing angle (θ_s) 
dependence of the probability for the fiducial value of the AGN fraction, f_A = 0.7. For the 
PAO data to be consistent with the simple AGN model at this fixed AGN fraction, a rather 
large smearing angle is required. 

FIG. 9. Smearing angle (θ_s) dependence of probabilities at an AGN fraction f_A = 0.7 by CADD 
and FEVD methods. 



So far, our hypothesis assumed that all AGN listed in the catalog are equal sources of 
UHECR with E > E_c. One difficulty with this assumption is that the number of available 
UHECR events is smaller than the number of AGN. Thus, not all AGN can be actual sources 
of the UHECR we consider. We took the view that the UHECR luminosity of AGN is small 
and that a randomly chosen subset of AGN is responsible for the observed UHECR. The 
other plausible possibility is that a certain subset of the listed AGN is the genuine source of 
UHECR and the others are not. To make this hypothesis more concrete, we need to further 
classify AGN in some way and narrow down the source candidates among them. Toward this 
purpose, several authors have explored the idea that UHECR come from AGN accompanied 
by strong radiation in the X-ray or γ-ray range [5, 26-32]. 

FIG. 10. Probabilities for the PAO data from the simple AGN model with AGN in distance 
bins of 20 Mpc. For the AGN fraction, the fiducial value f_A = 0.7 is used and for the 
smearing angle, the moderate value θ_s = 10° is used. 

In our previous analysis, we tried a simple geometrical classification based on distance 
binning, and this led to the rather interesting result that AGN residing in the distance 
range 40-80 Mpc show a striking correlation with the PAO UHECR data. Thus, we perform 
the same analysis again. Figure 10 shows the correlation probabilities between the PAO 
UHECR data and the simulated data obtained by assuming that the AGN residing in each 
distance range are responsible for the UHECR. In this case, we set the AGN fraction 
f_A = 0.7 and the smearing angle θ_s = 10°. The addition of new data weakens the correlation; 
however, both CADD and FEVD still give good correlation probabilities in the distance range 
60-80 Mpc. The similarity between the distribution of the PAO UHECR data and that of 
AGN in the distance range 60-80 Mpc can be seen visually in Figure 11. An analogous result 
was reported by Ryu et al. [33]. They measured the separation angles S between the UHECR 
of the 2007 PAO data and their nearest AGN in the 12th edition of the VCV catalog, then 
plotted S versus the distance of the correlated AGN. Rather independently of S, the 
correlated AGN are concentrated in the 40-60 Mpc distance range (see Figure 5 in [33]). 
This is consistent with the above result. We do not have a reasonable explanation for 
this correlation yet. However, it can possibly be interpreted as the imprint of the large 
scale structure of the universe or as an indication that a certain subclass of AGN is the 
genuine source of UHECR. 
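
The distance binning itself is a simple selection step, which can be sketched in Python as follows. The function name, the half-open bin convention, and the toy catalog entries are hypothetical; each resulting subset would then serve as the source list of the simple AGN model with f_A = 0.7 and θ_s = 10°.

```python
def bin_by_distance(agn, bin_width_mpc=20.0, d_max_mpc=100.0):
    """Group (name, distance_mpc) pairs into half-open bins [lo, lo + width)."""
    n_bins = int(round(d_max_mpc / bin_width_mpc))
    bins = {(k * bin_width_mpc, (k + 1) * bin_width_mpc): [] for k in range(n_bins)}
    for name, distance in agn:
        k = int(distance // bin_width_mpc)
        if 0 <= k < n_bins:                   # drop sources beyond d_max
            bins[(k * bin_width_mpc, (k + 1) * bin_width_mpc)].append(name)
    return bins

# Toy catalog with made-up names and distances (not VCV entries).
toy_catalog = [("src-a", 3.8), ("src-b", 45.0), ("src-c", 65.0),
               ("src-d", 79.9), ("src-e", 95.0)]
for (lo, hi), names in bin_by_distance(toy_catalog).items():
    print(f"{lo:5.0f}-{hi:3.0f} Mpc: {names}")
```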
FIG. 11. Distributions of AGN in the 60-80 Mpc distance range and the arrival directions of 
UHECR with E > 55 EeV observed by PAO. 

V. DISCUSSION AND CONCLUSION 



The PAO first reported the correlation between AGN and UHECR in 2007 [3, 4]. They 
found that 20 out of 27 UHECR events with energies above 57 EeV are correlated with at least 
one of the 442 AGN within the distance 71 Mpc listed in the 12th edition of the VCV catalog 
when the correlation angular distance is fixed to ψ = 3.2°. In the updated paper 
published in 2010 [5], the energy threshold was modified from 57 EeV to 55 EeV due to the 
energy calibration, while the other parts of the correlation test method remained the same. 
They divide the 69 UHECR events with energies above 55 EeV detected from 1 January 2004 
to 31 December 2009 into three periods. Using the data of Period 1, they set the three 
parameters, the distance cutoff for AGN d_c < 75 Mpc, the energy threshold for UHECR 
55 EeV, and the correlation angular distance ψ = 3.1°, through an exploratory scan that 
minimizes the chance probability that the observed UHECR events come from a simple 
isotropic distribution. These parameters are then applied to the other data sets to test the 
correlation between UHECR and AGN. As a result, 17 out of 27 events are correlated with 
AGN, and the degree of correlation, defined as the fraction of correlated events, 
p_data = 0.63 is obtained from the data presented in the 2007 paper (Period 1 + Period 2). 
When the updated data are used, 29 out of 69 events are located within the correlation 
angular distance, so the degree of correlation is reduced to p_data = 0.42. For a stricter 
examination, the data used in the exploratory scan need to be excluded. When only 
the data detected during Period 2 and Period 3 are used, 21 out of 55 events are correlated 
and the degree of correlation is reduced further to p_data = 0.38. If an isotropic distribution 
is assumed, the expected number of correlated events is 11.6 and the probability of finding 
such a correlation by chance is P = 0.003 (see Table 1 in [5]). Thus the updated PAO data 
indicate that the distribution of UHECR is neither completely isotropic nor very strongly 
correlated with AGN. 
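
The numbers quoted above can be cross-checked with a few lines of Python. This is a sketch that assumes the chance probability is a plain binomial tail, with the isotropic success probability inferred from the quoted expectation of 11.6 correlated events out of 55; whether PAO computed P in exactly this way is our assumption.

```python
import math

def binomial_tail(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
               for k in range(k_min, n + 1))

# Degree of correlation for the three data selections quoted in the text.
for n_corr, n_tot in [(17, 27), (29, 69), (21, 55)]:
    print(f"{n_corr}/{n_tot}: p_data = {n_corr / n_tot:.2f}")

# Chance probability for Period 2 + Period 3 under isotropy.
n_events, n_correlated = 55, 21
p_iso = 11.6 / n_events      # inferred from the expected 11.6 correlated events
chance = binomial_tail(n_correlated, n_events, p_iso)
print(f"p_iso = {p_iso:.3f}, P(>= {n_correlated} of {n_events}) = {chance:.4f}")
```

Under these assumptions the tail probability comes out at the few-per-mille level, in line with the quoted P = 0.003.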



However, as we noted in the previous paper [15], PAO's method is not sufficient to prove 
the correlation between AGN and UHECR. For the correlation test, our test methods are 
more direct and informative. The change in the results of our test methods between the 
2007 PAO data and the updated 2010 data seems to be consistent with that of PAO's 
method. Let us look into the details in terms of the best probability. In the previous AGN 
fraction scan (with the smearing angle θ_s = 6°), the best probability occurred at f_A = 0.45 
for CADD and at f_A = 0.42 for FEVD; with the updated data, CADD has its maximum at 
f_A = 0.37 and FEVD at f_A = 0.16 (see Figure 8). In the smearing angle scan (with the AGN 
fraction fixed at f_A = 0.7), the smearing angle with the maximum probability shifts from 
θ_s = 36° to θ_s = 46° for CADD and from θ_s = 45° to θ_s = 168° for FEVD (see Figure 9). 
This means that the AGN model needs a larger isotropic component to describe the UHECR 
distribution, which is consistent with the results of PAO. We can interpret this as the 
updated data being more isotropic than the previous data. 

We used the 13th edition of the VCV catalog for the AGN information. However, the VCV 
catalog is incomplete in the sense that it is not obtained from a single observational 
mission and does not cover the full sky uniformly. Therefore, the use of the VCV catalog 
for the correlation test has certain limitations. PAO also mentioned this point and 
considered the incompleteness of the VCV catalog in the galactic plane region. There are 
9 UHECR events within ±10° of the galactic plane. When they exclude these events to 
compensate for the incompleteness in the galactic plane region, the correlation increases 
from p_data = 0.38 to p_data = 0.46, i.e. 21 out of 46 events are within the correlation 
angular window. It is hard to say that this change is statistically significant. When we 
apply this approach to CADD, the best value of the AGN fraction increases slightly to 
f_A = 0.41, while the best value of the smearing angle, θ_s = 45°, is similar to the result for 
the whole data set. We do not see a significant effect of the incompleteness in the galactic 
plane region at this stage. Also, we cannot determine whether these results are caused by 
the incompleteness of the catalog or by the deflection due to the strong magnetic field in 
the galactic plane region. These possibilities need to be explored further. 

Our analysis assumed, for simplicity, that all AGN have the same UHECR luminosity. 
Thus, relatively close AGN dominate over the others in the UHECR flux. This can be seen 
in the upper panel of Figure 2, where many red dots representing mock UHECR are 
clustered around Centaurus A (Cen A) and Messier 87 (M87), two representative nearby 
objects in the VCV catalog. In the observed PAO UHECR data, the black dots representing 
observed UHECR are indeed clustered around Cen A. This supported the strong correlation 
between UHECR and AGN reported by the PAO collaboration. On the other hand, no such 
clustering is seen around M87, which is one of the brightest galaxies in the Virgo cluster. 
This discrepancy may be a main cause of our main result that both CADD and FEVD 
methods rule out AGN dominance with small smearing angles. Zaw et al. suggested that 
this lack of observed UHECR around M87 can be explained by considering the bolometric 
luminosity [34]. They investigated the bolometric luminosity of AGN that are correlated 
with PAO UHECR (with the criteria for correlation as in [3]) and determined an empirical 
lower bound on the bolometric luminosity, L_bol = 5 × 10^42 erg s^-1, for UHECR production. 
In the VCV catalog, the Virgo cluster contains many AGN with bolometric luminosity below 
this bound, i.e. low-luminosity AGN (LLAGN), and LLAGN do not have enough power to 
accelerate UHECR under the conventional AGN UHECR acceleration model. Therefore, this 
can be a possible reason for the UHECR deficiency near the Virgo cluster. Statistical tests 
using CADD and FEVD for an AGN model with types or luminosity taken into account would 
be good future work in tracing the UHECR origin. 
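
The effect of the equal-luminosity assumption, and of a luminosity cut of the kind suggested by Zaw et al., can be illustrated with the following Python sketch. The only value taken from the text is the lower bound L_bol = 5 × 10^42 erg s^-1; the function name, the record layout, and every catalog entry are hypothetical and chosen only to show the 1/d^2 scaling.

```python
L_BOL_MIN = 5e42  # erg/s, empirical lower bound quoted above from Zaw et al. [34]

def flux_weights(catalog, l_bol_min=None):
    """Relative UHECR flux per source for equal UHECR luminosity (weight ~ 1/d^2).

    `catalog` is a list of dicts with keys "name", "distance_mpc", "l_bol".
    If `l_bol_min` is given, sources below that bolometric luminosity are dropped.
    """
    selected = [s for s in catalog
                if l_bol_min is None or s["l_bol"] >= l_bol_min]
    raw = {s["name"]: 1.0 / s["distance_mpc"] ** 2 for s in selected}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}

# Made-up sources: one very close, one close but faint, one distant and bright.
toy = [
    {"name": "near-bright", "distance_mpc": 4.0, "l_bol": 1e43},
    {"name": "near-faint", "distance_mpc": 16.0, "l_bol": 1e41},
    {"name": "far-bright", "distance_mpc": 64.0, "l_bol": 1e44},
]
print("all sources:      ", flux_weights(toy))
print("with L_bol cut:   ", flux_weights(toy, L_BOL_MIN))
```

With equal luminosities, a source four times closer contributes sixteen times the flux, so a handful of nearby objects controls the predicted sky; removing nearby low-luminosity objects therefore changes the model prediction substantially.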

In conclusion, we reexamined the correlation between UHECR and AGN using the updated 
data sets: for UHECR, we used 69 events with energy E > 55 EeV released in 2010 by PAO, 
and for AGN, we used 862 AGN within the distance d < 100 Mpc listed in the 13th edition 
of the VCV catalog. To make the test hypothesis definite, we built the simple AGN model in 
which UHECR originate both from AGN, with the fraction f_A, and from the isotropic 
background. We treated all AGN as equal sources of UHECR. We also introduced the 
smearing angle θ_s to incorporate the effects of galactic and extragalactic magnetic fields. 
Then we compared the arrival direction distributions observed by PAO and expected from 
the model by the CADD and FEVD methods. These methods reduce the two-dimensional 
arrival direction distribution to a one-dimensional probability distribution that reflects the 
correlation between UHECR and their source candidates, so that we can apply the standard 
KS test and calculate the chance probability that the observed distribution comes from the 
model. 
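
The reduction from two dimensions to one can be sketched in Python for the CADD case. We assume here that the CADD is built from the angular distances of all UHECR-AGN pairs, as the name suggests; the precise definition is the one given earlier in the paper, and the function names and toy coordinates below are hypothetical.

```python
import math

def unit_vector(ra_deg, dec_deg):
    """Unit vector for equatorial coordinates given in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))

def angular_distance_deg(u, v):
    """Angle between two unit vectors in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))

def cadd(events, sources):
    """Sorted angular distances of all event-source pairs (a 1D sample)."""
    event_vecs = [unit_vector(ra, dec) for ra, dec in events]
    source_vecs = [unit_vector(ra, dec) for ra, dec in sources]
    return sorted(angular_distance_deg(u, v)
                  for u in event_vecs for v in source_vecs)

# Toy (RA, Dec) pairs in degrees; not PAO events or VCV sources.
toy_events = [(10.0, -20.0), (200.0, -45.0)]
toy_sources = [(12.0, -21.0), (190.0, -40.0), (80.0, 30.0)]
print([round(x, 1) for x in cadd(toy_events, toy_sources)])
```

A one-dimensional sample of this kind, computed once for the observed events and once for mock events drawn from the model, is what the KS test then compares.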

Our results show that both CADD and FEVD methods rule out the AGN dominance 
model with a small smearing angle (f_A > 0.7 and θ_s < 6°). Concerning isotropy, CADD 
shows that the distribution of the PAO data is marginally consistent with isotropy. The 
best-fit model lies around the AGN fraction f_A = 0.4 and the moderate smearing angle 
θ_s = 10°. For the fiducial value f_A = 0.7, the best probability of CADD was obtained at a 
rather large smearing angle, θ_s = 46°. In short, our results imply that for the whole AGN 
to be viable sources of UHECR, either an appreciable amount of additional isotropic 
background or a large smearing effect is required. This situation for AGN as UHECR sources 
can be improved by narrowing down the UHECR sources from the whole AGN to a certain 
subclass of AGN. We tried distance binning as an illustration and found that the AGN 
residing in the distance range 60-80 Mpc have a good correlation with the updated PAO 
data. This good correlation may have occurred by chance, but may also be an indication 
that the large scale structures surrounding AGN can be important for the production of 
UHECR. In this regard, the research on the possibility that the subclass of AGN is 